Word Formation Approach to Noun Phrase Analysis for Thai
Authors
Abstract
Noun phrase analysis is one of the most important components in Natural Language Processing (NLP) applications such as information retrieval, extraction, and categorization. For Thai, noun phrase analysis poses unique problems, namely noun phrase boundary identification, noun phrase decomposition and relation extraction, and core noun detection. Statistical and rule-based word formation is therefore proposed as a means of efficient noun phrase analysis, reducing the possible variants of boundary identification at both the local and global levels. In the comparison, NP analysis reaches approximately 90% with word formation and approximately 78% without it.
Similar resources
Do Heavy-NP Shift Phenomenon and Constituent Ordering in English Cause Sentence Processing Difficulty for EFL Learners?
Heavy-NP shift occurs when speakers prefer placing lengthy or “heavy” noun phrase direct objects in the clause-final position within a sentence rather than in the post-verbal position. Two experiments were conducted in this study, and their results suggested that having a long noun phrase affected the ordering of constituents (the noun phrase and prepositional phrase) by advanced Iranian EFL le...
Investigating Embedded Question Reuse in Question Answering
The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance through reuse of information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...
Equivalence in Technical Texts: The Case of Accounting Terms in English-Persian Dictionaries
Translating accounting documents in general, and accounting terminology in particular, is not a simple task, especially when new terms keep being created in pace with accounting developments. This study was carried out to find the most common and preferable ways to translate accounting terms from English into Persian. Also, an attempt was made to identify the frequently used patterns of word-fo...
A Non-local Attachment Preference in the Production and Comprehension of Thai Relative Clauses
In parsing, a phrase is more likely to be associated with an adjacent word than with a non-adjacent one. Instances of adjacency violation pose a challenge to researchers but also an opportunity to better understand how people process sentences and to improve parsing algorithms by, for example, suggesting new features that can be used in machine learning. We report corpus counts and reading-time d...
Classifier Assignment By Corpus-Based Approach
This paper presents an algorithm for selecting an appropriate classifier word for a noun. In the Thai language, the choice of classifier for a given concrete noun frequently fluctuates, both across the whole speech community and among individual speakers. Basically, there is no exact rule for classifier selection. As far as we can do in the rule-based appro...